@clementval

When an asynchronous allocation is made, we call `cudaMallocAsync` with a stream. For deallocation, we need to call `cudaFreeAsync` with the same stream. To achieve that, we need to track each allocation together with its stream.

This patch adds a simple sorted array of asynchronous allocations. A binary search is performed to retrieve the allocation when deallocation is needed.
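
To illustrate the idea, here is a minimal sketch of a sorted array of allocations with a binary-search lookup. It is not the code from this patch: `AsyncAllocTracker`, `AsyncAlloc`, `insert`, and `take` are hypothetical names, `Stream` is a stand-in for `cudaStream_t` so that the example builds without the CUDA toolkit, and `std::vector` is used only for brevity.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Stand-in for cudaStream_t (assumption, keeps the sketch CUDA-free).
using Stream = void *;

// One tracked asynchronous allocation.
struct AsyncAlloc {
  void *ptr;     // address returned by the asynchronous allocation
  Stream stream; // stream the allocation was made on
};

// Hypothetical tracker: entries are kept sorted by address.
class AsyncAllocTracker {
public:
  // Record an allocation, inserting at the position that keeps the array sorted.
  void insert(void *ptr, Stream stream) {
    allocs_.insert(lowerBound(ptr), AsyncAlloc{ptr, stream});
  }

  // Binary-search for ptr. On a hit, return its stream through `stream`,
  // drop the entry, and return true. On a miss, return false.
  bool take(void *ptr, Stream &stream) {
    auto it = lowerBound(ptr);
    if (it == allocs_.end() || it->ptr != ptr)
      return false;
    stream = it->stream;
    allocs_.erase(it);
    return true;
  }

  std::size_t size() const { return allocs_.size(); }

private:
  // First entry whose address is not less than ptr (std::lower_bound is a
  // binary search). std::less gives a well-defined order on raw pointers.
  std::vector<AsyncAlloc>::iterator lowerBound(void *ptr) {
    return std::lower_bound(
        allocs_.begin(), allocs_.end(), ptr,
        [](const AsyncAlloc &a, void *p) { return std::less<void *>{}(a.ptr, p); });
  }

  std::vector<AsyncAlloc> allocs_;
};
```

In this sketch, the deallocation path would call `take(ptr, stream)` and, on a hit, pass the recovered stream to `cudaFreeAsync(ptr, stream)`. Lookup is O(log n), while insertion and removal shift elements and are O(n), which is the usual trade-off for a simple sorted array.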

@clementval clementval requested a review from wangzpgi April 23, 2025 22:11
@clementval clementval merged commit 565a075 into llvm:main Apr 24, 2025
9 checks passed
@clementval clementval deleted the stream_alloc branch April 24, 2025 17:01
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…on (llvm#137073)